ABSTRACT

Vocabulary systems for individuals using 
communication aids can be broken down into two major types: coverage 
vocabularies and acceleration vocabularies. Acceleration vocabularies
or word sets assume the individual is able to spell and words are 
chosen which speed up the rate of communication. Acceleration 
techniques may include sets of most frequently used words or 
syllables, abbreviation-expansion techniques, word prediction 
techniques, linguistic expansion, and full sentence storage and 
retrieval. This paper presents data regarding the coverage power of 
most-frequently-used-word sets, the effects of word length and 
keystroke analyses on calculated word set efficiencies, and the 
effect of different termination and abbreviation strategies. Word set 
analyses which do not take keystroke analysis into account are shown 
to be overly optimistic due to the heavily skewed distribution of 
word length versus usage. A generic abbreviation algorithm is 
described which allows implementation and comparison of other
strategies. Fixed-vocabulary, letter-based word prediction techniques 
are shown to be a form of cued abbreviation expansion with a 
relatively low motor efficiency. (Author/DB)

Principles involved in the formation of acceleration vocabularies are reviewed, followed by an
analysis of alternate strategies for implementing abbreviation expansion acceleration techniques.
Data regarding the coverage power of most-frequently-used-word sets, the effects of word length 
and keystroke analyses on calculated word set efficiencies, and the effect of different termination 
and abbreviation strategies are presented. Word set analyses which do not take keystroke analysis 
into account are shown to be overly optimistic due to the heavily skewed distribution of word 
length versus usage. A generic abbreviation algorithm is described which allows implementation 
and comparison of other strategies. Fixed-vocabulary, letter-based word prediction techniques 
are shown to be a form of cued abbreviation expansion with a relatively low motor efficiency. 

KEY WORDS: abbreviation expansion, acceleration, aided communication, prediction, vocabular- 
ies, word frequency, word sets 



Vocabulary systems for individuals using communi- 
cation aids can be broken down into two major types: 
coverage vocabularies and acceleration vocabularies. 
Coverage vocabularies are vocabularies (picture sets,
symbol sets, etc.) which try to provide an individual 
with a set of "words" or "concepts" with which the 
individual can communicate. The objective of this type 
of word set is to try to provide the individual with the 
ability to communicate most effectively and about the 
widest range of topics, given the limited word set. 
These vocabularies are generally provided to an individ- 
ual who cannot spell, and whose communication is 
therefore restricted to this particular vocabulary (picture 
set, symbol set, etc.). 

Acceleration vocabularies or word sets, on the other 
hand, assume that the individual is able to spell. Because
the users can spell, they are already able to say
anything that they would like. Thus, the words chosen
for the acceleration set are not chosen for their ability
to allow an individual to communicate their ideas, but
rather to speed up the rate at which they can communicate
them.

Acceleration techniques can take many forms, including
sets of most frequently used words or syllables
(Goodenough-Trepagnier & Rosen, 1982; Harris & Vanderheiden,
1980; McDonald & Schultz, 1973), abbreviation-expansion
techniques (Vanderheiden, 1984), word prediction techniques
(Soede & Foulds, 1986; W. Woltosz, personal communication,
1985), linguistic expansion (Eulenberg & Rahimi, 1978;
Galyas & Hunnicutt, 1984), and full sentence storage and
retrieval (Baker, 1984; HC Electronics, 1978; Vanderheiden &
Grilley, 1975).

Acceleration vocabularies can incorporate the stor- 
age and retrieval of entire phrases or sentences. In 
order for the sentence to be stored in advance, how- 
ever, it must be known in advance and/or be a phrase 
or sentence which is very commonly used. This accel- 
eration strategy (prestored phrases or sentences) is 
most effective for common requests ("I'm thirsty," "I
have . . .," "When do we . . .," "Can I have a . . .," etc.)
and comments ("I don't like that," "Cut it out," "This is
good," etc.). Because of the rapid speed with which
these phrases can be called up and played out, they
can be used very effectively in situations where a
spelled-out or assembled sentence would take too long.
They can also be used in places where interjections
must be swift in order to have any effect (e.g., communication
continuation phrases or emergency situations).
Thus, the storage and retrieval of full sentences
is a very powerful communication acceleration technique.
The vast majority of our communication, however,
is of a novel type. That is, the vast majority of
what we say every day is not predictable, and cannot
be prestored or handled through prestored phrases.
For this communication, we must rely on acceleration
techniques that help the individual assemble sentences
as they occur. It is the "novel communication" acceleration
techniques that are the subject of this paper,
although the same techniques that are used to call up
words can also be used to call up prestored phrases
or sentences.

In this paper, the basic concepts and issues will be 
presented, along with data regarding the impact (pro 
and con) of various types of acceleration word sets.
These discussions are used as a basis for an analysis 
of several abbreviation-expansion approaches, as well 
as the relationship of fixed-vocabulary spelling predic- 
tion techniques. 

Most-Frequently-Used Words Strategy 

The first principle underlying most of the acceleration
word set techniques is the most-frequently-used words
strategy. This strategy focuses on providing the easiest
access to those words which are used most frequently.
Most techniques are based on this strategy because normal
word usage is very heavily skewed, with a relatively
small number of words accounting for most of our
speech. Figure 1 shows the word distribution from
several major spoken and written word studies. In all
of these cases, it can be seen that a relatively small
number of words account for a relatively large portion
of the communication. Typically, the first 50 words will
account for 40 to 50% of the total words communicated,
even though they account for only 0.5% of a 10,000-word
vocabulary. One hundred words would account
for approximately 60%, 200 words 70%, and 400
words 80% (Berger, 1967; Horn, 1926; Howes, 1966;
Rinsland, 1945; Sherk, 1973).
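
The coverage figures quoted above come from counting how much of a word sample is accounted for by its N most frequent words. A minimal sketch of that calculation follows, assuming the sample is available as a simple list of words; the function name and the toy sample are illustrative and are not taken from the cited studies.

```python
from collections import Counter


def top_word_coverage(words, n):
    """Return the fraction of all word occurrences covered by the n most frequent words."""
    if not words:
        return 0.0
    counts = Counter(words)
    covered = sum(count for _, count in counts.most_common(n))
    return covered / len(words)


# Toy sample (illustrative only, not data from the studies cited above).
sample = "the cat and the dog and the bird saw the cat".split()
for n in (1, 2, 6):
    print(n, round(top_word_coverage(sample, n), 3))
```

Run on a real word sample, the same function would produce a curve of the kind plotted in Figure 1.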



As can be seen, a relatively small number of words
account for the vast majority of word usage. It can also
be seen that once beyond the initial couple of hundred
words, each additional word adds relatively little. Thus,
providing quick access to the first 100 words will have
a much greater impact on the person's speed of communication
than quick access to the next 1,000 words.
As a result, most of the acceleration techniques tend
to focus on the most-frequently-used words to optimize
their effectiveness.

Consistency of the Words Across Word Studies 

A second important factor is consistency across 
word samples. In examining the above studies, it is
interesting to note that a relatively small number of
words accounted for a large portion of each of the word
samples. What is not clear from Figure 1 is whether it
is the same words which appear most frequently in the
different word samples. If the same words do appear,
then these words could be used within the acceleration
technique for an individual with a fair degree of confidence
that the words would be this individual's
most-frequently-used words as well. If, however, the
most-frequently-used words were substantially different
within each sample, then the value of this
"most-frequently-used words" phenomenon would be greatly
reduced. That is, if the most-frequently-used words
differed greatly from individual to individual or population
to population, then a "standard set" of
most-frequently-used words would not exist, and a special set
would need to be calculated for each individual or
population.

Research by Vanderheiden, Cress, and Kelso (1987)
indicates that the most-frequently-used words do differ
between studies. However, there is a fair degree of
commonality, and a set of 50 to 200 "frequently-used
words" can be constructed which will apply across most
word samples, with only a minimal loss (5-8%, and less
for larger word sets) in covering power from a custom
"frequently-used words" set developed from the specific
study(s). Thus, the general principle illustrated in
Figure 1 can be used to generate "most-frequently-used
words" for arbitrary clients. The expected efficiency
in actual daily use, however, would be somewhat less than
that shown in Figure 1.

Effect of Word Length 

A third factor that needs to be considered in analyzing
acceleration word sets is the effect of word length.
Since a longer word takes
longer to spell than a shorter word, the longer word
would be more valuable (based on length times frequency)
than a shorter word that was used just as
frequently. In fact, a long word which is used less
frequently may be more valuable than more frequently
used words which are shorter. Table 1 shows how this
happens. In this table, the words are ordered by frequency
of use, with the most frequently used words
first. Ordered in this fashion, the word "because" is the
56th word in the list. However, if we take into consideration
the length of the word, and calculate the number
of keystrokes that this word represents, we find that it
is nearly twice as valuable as the word "up," even
though the word "up" is much higher on the list. Since
the time it takes to communicate or spell out a message
on a communication aid is more a function of the
number of letters than the number of words, a "keystroke"
analysis would be much more accurate than a
simple word frequency analysis. Thus, a word set which
was arranged by length times frequency of use would
be a more useful tool for acceleration vocabulary than
would be vocabulary sets based on frequency alone,
such as those of Berger (1967), Beukelman, Yorkston,
Poblete and Naranjo (1984), Horn (1926), and Howes
(1966).

TABLE 1: Effect of Word Length on Value of a Word in a
Wordset

Rank   Word      Frequency per   Length         Keystrokes per 1000 words
                 1000 words*     (with space)   to type this word
...    think     ...             ...            ...
56     because   3.65            8              29

* Based on Howes (1966) spoken word study

The length of words, however, has implications be- 
sides just reordering of most-frequently-used word 
sets. It also has significant effects on the calculated 
efficiencies of these word sets. Figure 2 shows the 
decrease in efficiency of most-frequently-used word
sets when a keystroke analysis is used rather than the
simpler word coverage analysis. For example, if we
look at the 50 most frequently used words, we can see
that they account for 50.13% of the words, but only
29% of the letters or keystrokes. Similarly, a 100-word
set would yield only 38.7% of the characters, versus
63% of the words. A 400-word set would yield only
56% of the characters, versus 81% of the words. Thus,
using a keystroke analysis rather than a straight word
frequency analysis decreases the apparent covering 
power of these word sets by between a third and almost 
a half, depending upon the size of the word set. 
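
The difference between the two analyses can be sketched in a few lines. The following is an illustrative sketch only: it assumes, as in Table 1, that a word's length includes one trailing space, and the function names and the toy frequency table are hypothetical rather than taken from the Howes data.

```python
def keystroke_value(word, frequency):
    """Length (letters plus one trailing space) times frequency of use."""
    return (len(word) + 1) * frequency


def word_and_keystroke_coverage(frequencies, word_set):
    """Return (fraction of words covered, fraction of keystrokes covered)."""
    total_words = sum(frequencies.values())
    total_keys = sum(keystroke_value(w, f) for w, f in frequencies.items())
    set_words = sum(frequencies[w] for w in word_set)
    set_keys = sum(keystroke_value(w, frequencies[w]) for w in word_set)
    return set_words / total_words, set_keys / total_keys


def rank_by_value(frequencies):
    """Order words by length times frequency, most valuable first."""
    return sorted(frequencies,
                  key=lambda w: keystroke_value(w, frequencies[w]),
                  reverse=True)


# "because" at 3.65 occurrences per 1000 words is worth about 29 keystrokes.
print(keystroke_value("because", 3.65))

# Toy frequency table (hypothetical counts, for illustration only).
toy = {"a": 6, "to": 3, "because": 2}
print(word_and_keystroke_coverage(toy, {"a"}))
print(rank_by_value(toy))
```

In the toy table the single most frequent word covers more than half of the words but only about a third of the keystrokes, which is the same pattern, in miniature, as the drop from word coverage to keystroke coverage described above.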

This loss can be partially offset by reordering the 
words, so that the most valuable words (based on
length times frequency) are chosen rather than simply 
the most frequent. This still results, however, in a much 
lower calculated coverage (Fig. 2). 

The reason for this discrepancy between word cov- 
erage and keystroke coverage is that the most fre- 
quently used words are mostly short words, whereas the
less frequently used words are longer. Figure 3 shows
a plot of the length of words as a function of their
usage. In this plot, each bar represents 5% of the total
word sample of 250,000 words. As can be seen, we
have to go beyond the halfway point before we hit an
average word length of even 4 characters.

The implications of this are two-fold. First, looking 
only at the frequency of word use without looking at 
word length can result in a false positive indication for 
the value of the words or word sets. To get an accurate 
picture, the length-times-frequency analysis should be 
used. Further, for an optimum word set, the words 
should be reordered after they have been given a 
length-times-frequency value. However, reordering the 
word set has little impact on the overall efficiency. The 
efficiency would only be increased by 1 to 2%. 

Figure 2. Percent of total words and keystrokes which are accounted for by the most frequently used words.

Effect of Scanning Techniques on the Above
Calculations

All of the above calculations are made assuming that
the alphabet and the words are available on a direct
selection panel; that is, that each letter takes the same
amount of time to access (one unit), and that the words
take the same amount of time to access (one unit). On
a scanning panel, the calculations become much more
difficult. Each letter of the alphabet takes a different
amount of time to access. Thus, the value of a word
becomes equal not just to the length of the word
measured in letters, but also to the total length of time
it takes to select the letters in the word. In order to
calculate the amount of time saved by putting a word
on a scanning panel, it would be necessary to (a) sum
the amount of time it would take to select each of the
letters to spell the word, (b) subtract the time it would
take to select the word given its planned location on
the scanning panel, and then (c) multiply that result by
the frequency of the word. Without going into any more
detail here, it is sufficient to say that the rules that go
into selecting the optimal words for a direct selection
aid are different from and considerably simpler than
those that would be used to select the optimal words
for a scanning panel. The analyses in this paper should
not, therefore, be applied directly to vocabulary selection
for scanning panels. The magnitude of the error
which would be introduced by doing so is not known
at the present time, but is a topic currently being
investigated at the Trace Center.
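
Steps (a) through (c) for a scanning panel can be sketched as follows. This is only an illustration under stated assumptions: the panel layout is hypothetical, access time is modeled as simple row-column scanning (one unit per row step plus one unit per column step), and selecting a whole word is assumed to supply its trailing space. None of these details come from the paper.

```python
def scan_steps(layout, item):
    """Row-column scan cost: steps to reach the row plus steps along the row."""
    for r, row in enumerate(layout):
        if item in row:
            return (r + 1) + (row.index(item) + 1)
    raise KeyError(item)


def time_saved(layout, word, frequency):
    """(a) sum the letter times, (b) subtract the word's own time, (c) weight by frequency."""
    spelled = sum(scan_steps(layout, ch) for ch in word + " ")
    return (spelled - scan_steps(layout, word)) * frequency


# Hypothetical 3 x 4 scanning panel; " " is the space key.
panel = [
    [" ", "e", "a", "the"],
    ["t", "h", "o", "and"],
    ["n", "d", "s", "r"],
]

print(time_saved(panel, "the", 10))
print(time_saved(panel, "and", 5))
```

Because the saving depends on where both the letters and the word sit on the panel, moving a word to a different cell changes its value, which is why the direct-selection rankings cannot simply be reused.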

Abbreviation Expansion Techniques 
Overview 

Outside of the inclusion of words and phrases directly 
on communication boards or scanning aids, the next 
most popular acceleration technique is the use of ab- 
breviation expansion techniques. With these tech- 
niques, each word in the unit's expansion vocabulary 
is represented by a smaller set of letters or keystrokes. 
By having a shorter spelling for each word, it is possible 
to save keystrokes. These techniques are usually used 
with electronic aids which automatically look up the 
shorter spellings and then replace the abbreviation with 
the fully spelled word. Thus, the individual types a short 
sequence of keys which is instantly translated into a 
fully spelled word on the display. 

Abbreviation expansion techniques can be used with
words, phrases, or entire sentences. As discussed
previously, the bulk of everyday communication (in
excess of 90%) would still need to be constructed by
assembling words, or phrases and words. This discussion
is limited to word abbreviations.

Basically, word abbreviation expansion techniques 
amount to providing an alternate, shorter spelling for 
the words. In some cases, these shorter spellings are 
chosen to resemble the original words. For example: 

abv. = abbreviation 
tr = then 
bcs = because 

There is no requirement, however, that the abbreviations
be spelled in any particular fashion. The word
"then" could have been abbreviated to "Tl," or for that
matter to "QJ." In general, the abbreviations are chosen 
in order to facilitate learning and recall of the codes. As 
is discussed later, different abbreviation techniques 
have varying implications for the efficiency of the overall 
technique. 

Abbreviation expansion techniques are not neces- 
sarily restricted to the letters of the alphabet. Words 
could be represented by any sequence of keystrokes. 
THN, Tl, [ESCAPE][RETURN], or any other sequence
of keystrokes could be used to represent a word (or
phrase or sentence). In one technique called Minspeak
(Baker, 1984), words, phrases, and sentences are represented
by sequences of keystrokes or selections
having pictures or linguistic characteristics on them.
Figure 4 shows Minspeak icons in a three-key sequence
which could be used to store and recall the sentence
"What time are we going home?" Also shown are examples
of the sequences of keys which would be used
to store individual words. With Minspeak, pictures and
grammatical concepts are used rather than letters for
the codes for words in order to facilitate the recall of
large numbers of abbreviations. Further, since the sequences
for some of the words follow grammatical rules
and procedures, it would be possible for individuals to
figure out the codes for some words from scratch
without knowing the code for the word in advance. This
would, however, require a thorough familiarity with the
grammatical constructs used to generate the code.

Efficiency Calculations of Abbreviation Expansion 
Techniques 

Ignoring human perception, cognitive processing, 
and other cognitive/perceptual factors, it is possible to 
calculate the theoretical motor efficiency for different 
abbreviation-expansion approaches using a fairly 
straightforward process. A calculation which uses the 
length of the word minus the length of the abbreviation, 
multiplied by the frequency of use for the word, could 
be used. As an individual used an abbreviation expan- 
sion technique for a long period of time, his ability to 
recall abbreviations would tend to become more instan- 
taneous and reflexive, in the same way that the key- 
strokes for a word flow out of a touch typist's fingers. 
Eventually the typing of the abbreviation would begin 
to flow as fluently from the individual as the spelling for
the word (or perhaps more fluently for very long words). 
As this occurred, the motor efficiency calculation de- 
scribed would become a fair estimate of the relative 
efficiency of the technique. 
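
The motor efficiency calculation described above (word length minus abbreviation length, multiplied by frequency of use) can be written out directly. The sketch below is illustrative: it assumes that a word's length includes its trailing space and that the expansion supplies that space, and the abbreviations and counts shown are hypothetical.

```python
def keystrokes_saved(vocabulary):
    """vocabulary maps word -> (abbreviation, frequency of use).

    Each use saves (letters + trailing space) minus the abbreviation's keystrokes.
    """
    return sum((len(word) + 1 - len(abbr)) * freq
               for word, (abbr, freq) in vocabulary.items())


def motor_efficiency(vocabulary, total_keystrokes):
    """Fraction of all keystrokes in the sample eliminated by the abbreviations."""
    return keystrokes_saved(vocabulary) / total_keystrokes


# Hypothetical abbreviations and counts, for illustration only.
vocab = {"because": ("bc", 2), "the": ("tq", 10)}
print(keystrokes_saved(vocab))
print(motor_efficiency(vocab, 100))
```

This figure is a theoretical ceiling: it ignores recall time and other cognitive or perceptual factors, so it applies only once use of the abbreviations has become reflexive.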



Figure 4. Example of Minspeak key sequences for sentences and
words. The sentence shown is "What time are we going home?";
the individual words shown are "difficult," "problem," and "the."
(Graphics and examples courtesy of Bruce Baker, 1987)

Until this level of proficiency is reached, however,
these theoretically calculated motor efficiency factors
need to be substantially adjusted to account for recall
and other cognitive/perceptual factors, as discussed
below. The length of time it would take to go from
looking up abbreviations to reflexive use of abbreviations
has not been defined to any degree to date.
First-order approximations, however, might be obtained by
considering the time it takes for an individual to learn to
use another language fluently. In the case of abbreviations,
the individual may actually have a simpler task
than learning a second language, since he would only
be learning an alternate spelling for the words, and
would not have to learn a new syntax, sentence order,
grammar, idioms, etc.



Comparison of Abbreviation Techniques 

Using the above concepts and techniques, the rest
of the paper will explore different abbreviation-expansion
techniques in both qualitative and quantitative
form. First, the different abbreviation termination approaches
are discussed, including relative efficiency
issues. This is followed by a discussion of the different
abbreviation strategies, with quantitative analysis of
their relative efficiencies. Finally, a discussion of the
relationship of abbreviation expansion techniques to
prediction techniques is presented, along with clinical
implications of the analyses. The techniques covered
include:

Termination Techniques

• variable-length autoterminating

• variable-length with terminating period

• variable-length with terminating "expand" or "macro" key

• variable-length numeral-terminating (10-Branch)

Abbreviation Strategies

• truncation abbreviation strategy

• idiosyncratic-logical abbreviation strategy

• minimum length arbitrary abbreviation strategy

Prediction Techniques

• fixed-vocabulary techniques

• variable-vocabulary prediction techniques

The most general of the abbreviation strategies is the
"variable-length autoterminating" abbreviation expansion
algorithm (Vanderheiden, 1984). With this strategy,
any sequence of characters or keystrokes can be used
as the abbreviation. After each keystroke, the system
looks up the currently accumulated sequence of keystrokes
to see whether it matches any stored abbreviations.
If a match is found, then the sequence is removed
from the screen and replaced by the expansion
(word, phrase, or sentence). With this technique, it is
important to avoid using any abbreviations which are
identical to the beginning characters of a word. For
example, the word "there" might be abbreviated "TH"
(with no terminating character).
However, this abbreviation should be avoided, since as
soon as one began to spell any word which began with
"TH," the expansion "there" would appear. This problem
is easily overcome by either using two letters which
do not occur in sequence or by adding a third character
such as a period, dash, slash, and so forth (i.e., "TH."
or "TH-" or "TH/"). The variable-length autoterminating
approach is the most efficient of the abbreviation expansion
approaches, because all of the other
abbreviation expansion approaches are subsets of it,
as discussed below. The "QuicKey" technique is an
example of the variable-length autoterminating abbreviation
expansion algorithm. The algorithm is also used
as part of the Minspeak system.
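
The lookup-after-every-keystroke behavior described above can be sketched as a small simulation. This is a minimal sketch, not the implementation of any of the named products: it assumes that the "accumulated sequence" restarts after each space and after each expansion, and the function name and abbreviation tables are hypothetical.

```python
def type_keys(keys, abbreviations):
    """Simulate typing; expand as soon as the accumulated keystrokes match an abbreviation."""
    display = []   # one entry per visible keystroke or expansion
    pending = ""   # keystrokes accumulated since the last space or expansion
    for key in keys:
        pending += key
        display.append(key)
        if pending in abbreviations:
            # Remove the abbreviation from the screen and show the expansion instead.
            del display[len(display) - len(pending):]
            display.append(abbreviations[pending])
            pending = ""
        elif key == " ":
            pending = ""
    return "".join(display)


# Abbreviations ending in a period behave like the period-terminated technique.
print(type_keys("bc th. is", {"bc": "because", "th.": "there"}))

# An abbreviation that is also the start of ordinary words misfires while spelling "the".
print(type_keys("the", {"th": "there"}))
```

The second call shows the collision the text warns about: spelling "the" triggers the expansion for "TH" before the word is finished. Storing the code as "th." avoids it, which also illustrates how a terminated scheme can run on the same autoterminating engine.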

The variable-length, period-termination abbreviation
technique is probably the best known. With this technique,
any set of characters can be used as the abbreviation
without concern for collision with other words
or abbreviations. For example, the abbreviation "TH."
can be used without interfering with the use of the
words "the," "then," "their," and so forth, since none of
these words have a period as their third character. This
technique has the disadvantage, however, of requiring
an extra keystroke (the period) for each abbreviation.
Remembering that the most frequently used words are
short to begin with, this additional keystroke significantly
reduces the efficiency of this abbreviation technique.
The exact decrease would depend upon the
abbreviations used, but would amount to approximately
a 30% decrease in efficiency relative to the use of a
variable-length autoterminating approach (which did
not use a fixed terminating character). The
period-termination approach can be seen to be a subset of the
variable-length autoterminating approach by simply
putting a period at the end of all of the abbreviations
used with this latter approach.

Another common approach is to use some "expand" 
or "macro" key to indicate that the previous or following 
character (or characters) should be treated as an ab- 
breviation and expanded. For example, the letter "T" 
might be preceded or followed by a special "expand"
(or "ALT") key which would cause the letter to be looked
up and expanded to "the." Similarly, the letters "TH"
could be followed by an "expand" key in order to cause 
them to be expanded to "that." The advantage of this 
approach is again that any letters can be used for an 
abbreviation without a fear of collision with other words 
or abbreviations. The disadvantage is that it again adds 
one keystroke to each of the abbreviations, and would 
suffer the same loss of efficiency as putting a period at 
the end of each abbreviation. This technique can also 
be seen to be a subset of the variable-length, auto- 
terminating abbreviation generic approach by simply 
including the "expand" key as part of the stored sequence
of keys which make up the abbreviation.

Another commercially available abbreviation-expansion
technique is the "10-Branch" abbreviation-expansion
technique (Vanderheiden & Kelso, 1984; ZYGO,
1985). With this technique, all abbreviations consist of
a series of letters followed by a number. Allowable
abbreviations would be "T7," "TH8," and so forth. This
technique also avoids collision with any other abbreviations
or words, because of the unique letter-number
sequence. In addition, since it uses ten different termination
characters rather than one (as with the "period"
or "expand" strategies above), an increased number of
shorter codes is possible. As a result, the decrease in
efficiency is less, but would be in the range of 5 to
10%. Again, this technique can be implemented on a
system supporting the variable-length, autoterminating
abbreviation algorithm by using abbreviations which all
consist of a sequence of letters followed by a number.

Cued Versus Uncued Abbreviation Expansion 
Techniques 

All of the above techniques can be implemented in 
either an uncued or a cued fashion, where
the current expansion choices are displayed along with
their abbreviations. For example, with the 10-Branch
strategy, the 10 abbreviations which could be completed
with a single keystroke (one of the 0-9 keys) are
always displayed on the screen. Thus, when the letter
"T" is struck, the expansions for T0 through T9 are
shown on the screen, next to the number which would
complete the code for that item. If the person does not
want one of these choices and types an "H," the screen
would change to display those words whose abbreviations
were TH0 through TH9. Thus, if an individual
could remember the beginning of the code, the screen 
would automatically cue them, confirming the beginning 
and providing the final numeral of the sequence. 
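
The cue display for the 10-Branch strategy amounts to a lookup of the ten codes that the next digit could complete. A minimal sketch follows, assuming a simple dictionary of stored codes; the function name and the sample codes are hypothetical and are not taken from the commercial system.

```python
def cue_choices(prefix, abbreviations):
    """Return {digit: expansion} for every stored code of the form prefix + digit."""
    return {digit: abbreviations[prefix + digit]
            for digit in "0123456789"
            if prefix + digit in abbreviations}


# Hypothetical 10-Branch codes: letters followed by a single numeral.
codes = {"T2": "to", "T7": "the", "TH3": "there", "TH8": "that"}

print(cue_choices("T", codes))    # choices shown after typing "T"
print(cue_choices("TH", codes))   # choices shown after also typing "H"
```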

The use of cuing can greatly slow down the rate of 
communication using an abbreviation expansion tech- 
nique. The abbreviation expansion technique is most 
useful when the user memorizes the codes and simply 
types them in. If, in contrast, an individual types a
character, then checks a cue display, types another
character and then checks the display again, the theoretical
abbreviation-expansion rates discussed below
would not apply. Thus, while a technique which cues
individuals may be a helpful memory aid, it may significantly
slow the rate of communication if the individual
were to rely on it, especially for commonly used abbreviations.

Abbreviation Strategies 

In addition to the abbreviation expansion algorithms
discussed above, there are a number of different strategies
that might be used to select abbreviations for
each word. For example, a contraction strategy might
be used, where abbreviations are formed using first
and last letters in a word. Another approach is to use
truncation, where the beginning of the word is used. If
this approach is used, then a final terminating or confirming
character is required to differentiate the abbreviation
from the actual beginning of other words. Both
of these techniques (contraction and truncation) have
an advantage because they follow rules. With
rule-generated abbreviation strategies, the individual does
not have to remember the code, since they can construct
the code by thinking about the target word. They
have the disadvantage of tending to create longer
abbreviations, which significantly reduces the efficiency.
There is also a problem with multiple collisions
(e.g., two words having the same abbreviation).

At the other extreme is an abbreviation scheme which 
uses arbitrary letters and/or characters for the abbre- 
viation. With this approach, there is no direct relation- 
ship between the abbreviation and the expansion. This 
technique has the advantage of achieving maximum 
motor efficiency by using minimum length abbrevia- 
tions. It has a disadvantage due to the extreme difficulty 
of learning totally arbitrary codes for any large number
of words, thus requiring extensive practice to master a 
large vocabulary. This approach is primarily useful for 
establishing the maximum possible rate for abbreviation 
expansion in a given word set, assuming the individual 
has mastered the abbreviations. Use of short but arbi- 
trary codes is also a viable technique if applied to a 
small enough word set. For example, it would not take 
an individual who is cognitively able to master spelling 
very long to memorize 10 or 20 arbitrary codes, or 
perhaps as many as 50. Given the facts that (a) a small 
number of words accounts for a very high percentage 
of our word usage, and (b) the most frequently used 
words tend to be quite short, the use of very short but 
somewhat arbitrary codes for some of the most fre- 
quently used words is a viable approach. 

Between these two extremes lies a range of abbre- 
viation strategies which use letters or characters which 
are conceptually linked with the target word. Salient
letters may be used ("BC" for "because"), or characters
that resemble the sound of the word ("2" for "two", "U"
for "you"), or characters that have other linkages ("X"
for "across", or "X" for "no"). In general, these codes
would be chosen to be as short as possible and yet
easily remembered. Normally, the more frequently used 
words can use more cryptic codes, since they are used 
often and remembering the codes is not difficult (and 
having short codes is very valuable). Less frequently 
used words generally need to have abbreviations that 
are more obvious or easily remembered. Abbreviations 
which are chosen by an individual according to their 
logic might be termed "idiosyncratic-logical". 

Figure 5 shows a comparison of the relative efficien- 
cies of these different approaches, and Table 2 shows 
sample abbreviations from each aporoach. The abstract 
optimized codes show the results if 13 number and 
symbol keys were to be used for single-key abbrevia- 
tions, and all of the other 200 words received two- 
character abbreviations. The alphanumeric optimized 
assumes that letters and numbers could be used, and 
that 1 0 of the number and symbol keys were used in a 
logical fashion to represent words. Th.e alphabetic op- 
timized used only letters of the alphabet for codes. The 
trunction approach used the shortest available abbre- 
viation using the truncation technique. Where several 
words begin with the same sequence, the highest 
ranking word (length times frequency) was given the 
shortest code. The word "I" is counted as being three 
characters long (shift, i. space). The dash (-) is used to 
visually represent a space in the chart. The asterisk is 
used to represent the "EXPAND" or "CONFIRMATION" 
keystroke for the truncation approach. 

Figure 5. Percent of keystrokes eliminated as a function of vocabulary size for four abbreviation expansion approaches. 
(Horizontal axis: Abbreviated Words. Vertical axis: percent of keystrokes eliminated. Curves: Arbitrary Optimized; Alphanumeric Optimized (Idio-logical); Alphabetic Optimized (Idio-logical); Truncation.) 

The data are based upon the first 200 most valuable 
words (length times frequency) based on the Howes 
(1966) study. An excerpt of the data used in this 
analysis is shown (Table 2). The relative efficiencies of 
the different techniques would shift somewhat as the 
number of words in the wordset gets larger. However, 
the need for a terminating character for the truncation 
(and similar) techniques will always put them at a 
significant efficiency disadvantage, as shown in Figure 5. 
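
The truncation rule and the keystroke accounting described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the analysis that was run on the Howes data: the function names, the three-word frequency table, and the two-letter codes are hypothetical; every word is charged one extra keystroke for its trailing space; the shift keystroke for "I" is ignored; and the percentage is computed over the listed words only.

```python
def truncation_codes(freq):
    """Give each word the shortest free prefix, most valuable word first.

    Value is length times frequency, as in the text.
    """
    ranked = sorted(freq, key=lambda w: len(w) * freq[w], reverse=True)
    codes, taken = {}, set()
    for word in ranked:
        # A code as long as the word itself saves nothing, so stop one short.
        for n in range(1, len(word)):
            prefix = word[:n]
            if prefix not in taken:
                codes[word] = prefix
                taken.add(prefix)
                break
    return codes


def keystrokes_eliminated(freq, codes, needs_terminator):
    """Fraction of keystrokes saved over the words listed in `freq`."""
    full = 0   # keystrokes when every word is spelled out
    used = 0   # keystrokes when abbreviations are used
    for word, count in freq.items():
        word_cost = len(word) + 1                 # letters plus trailing space
        code = codes.get(word)
        if code is None:
            cost = word_cost                      # no abbreviation: spell it
        else:
            cost = len(code) + (1 if needs_terminator else 0)
        full += count * word_cost
        used += count * cost
    return 1 - used / full


# Hypothetical usage counts, for illustration only.
SAMPLE_FREQ = {"the": 50, "that": 30, "because": 20}
SELF_TERMINATING = {"the": "th", "that": "tt", "because": "bc"}

if __name__ == "__main__":
    trunc = truncation_codes(SAMPLE_FREQ)
    print(trunc)
    print(keystrokes_eliminated(SAMPLE_FREQ, SELF_TERMINATING, False))
    print(keystrokes_eliminated(SAMPLE_FREQ, trunc, True))
```

In this toy table the truncation codes are shorter than the two-letter codes, yet the terminating keystroke still leaves truncation with the smaller saving, which is the effect the text attributes to the terminator.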

The Effect of Collisions 

Collisions (when the same abbreviation would be 
useful for more than one word) occur in all abbreviation 
expansion routines. For example, "TK" would be a 
contraction for "thank," but would also be a contraction 
for "tank," "trick," and "truck." Although 
collisions occur with all abbreviation routines, they are 
more of a problem with truncation and contraction 
techniques for two reasons. First, words tend to begin 
and end with similar letter groupings. As a result, 
truncation and contraction techniques tend to yield a large 
number of collisions. Second, the general solution 
strategy when a collision is encountered is to select some 
other set of letters to represent one of the words. For 
idiosyncratic (individually selected) abbreviation 
approaches, this is not a problem. With contraction or 
truncation schemes, however, this counteracts the 
purpose of using truncation or contraction in the first place. 
If some words are truncated by rule but others have 
special codes, then the individual is left not only having 
to remember the special abbreviations, but also having to 
remember when to use truncation and when to try to 
remember a code. 
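
The "TK" example can be reproduced with a short Python sketch that groups words by the code a rule assigns them. It assumes, for illustration only, that contraction means keeping the first and last letters (which is consistent with the "TK" example but is not stated as the rule in the text); the function names are hypothetical.

```python
from collections import defaultdict


def contraction(word):
    """Assumed contraction rule: first and last letters, e.g. 'thank' -> 'TK'."""
    if len(word) < 2:
        return word.upper()
    return (word[0] + word[-1]).upper()


def find_collisions(words, abbreviate):
    """Return only the codes that more than one word maps to."""
    groups = defaultdict(list)
    for word in words:
        groups[abbreviate(word)].append(word)
    return {code: ws for code, ws in groups.items() if len(ws) > 1}


if __name__ == "__main__":
    sample = ["thank", "tank", "trick", "truck", "pencil"]
    print(find_collisions(sample, contraction))
```

Passing a different rule as `abbreviate` (for instance, a two-letter truncation) lets the same check be run on any rule-based scheme.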

Collision Accommodation 

Even idiosyncratic (individually selected) abbreviation 
techniques can run into collisions. For example, the 
abbreviation "PN" might be used for the word pencil, 
since "PN" does not normally occur as the beginning 
for any word in English. However, there are some 
seldom-used words, such as "pneumatic," "pneu- 
monia," and so forth, which do start with "pn." One 
approach to this problem is, of course, to simply select 
a different abbreviation for the word pencil. However, 
a second strategy would be to include a "re-abbreviate" 
key on the communication aid. The "re-abbreviate" key 
would simply take a word that had been expanded and 
change it back into its abbreviated form. Thus, the 
individual would type "PN," the word "pencil" would 
appear, the individual would hit the re-abbreviate key, 
and the word would change back into "PN," allowing 
the individual to continue typing the word "pneumonia." 
This would add one keystroke to the typing of this 
seldom-used word ("pneumonia"), but would allow the 
individual to use the "PN" abbreviation sequence the 
rest of the time. The re-abbreviate strategy can be used 
to allow individuals to use a large number of short, very 
valuable abbreviations which would otherwise be un- 
usable because of the spelling of some infrequently 
used word. 
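
The behavior of the re-abbreviate key can be sketched as a small keystroke interpreter. This is an illustration under assumptions, not a description of any particular device: the key name `<RE>`, the function name, and the code table are hypothetical, codes are assumed to expand as soon as they are typed, and expansion is assumed to stay switched off after re-abbreviation until the next space.

```python
REABBREVIATE = "<RE>"  # hypothetical name for the re-abbreviate key


def type_keys(keys, codes):
    """Return the text produced by a sequence of keystrokes."""
    out = []           # finished pieces of text
    current = ""       # characters typed since the last word boundary
    literal = False    # True after re-abbreviate: no expansion until a space
    last_code = None   # code of the word that was just expanded, if any
    for key in keys:
        if key == REABBREVIATE:
            if last_code is not None:
                out.pop()              # remove the expanded word
                current = last_code    # put the typed letters back
                literal = True
                last_code = None
            continue
        last_code = None
        if key == " ":
            out.append(current + " ")
            current = ""
            literal = False
            continue
        current += key
        if not literal and current in codes:
            out.append(codes[current] + " ")   # self-terminating expansion
            last_code = current
            current = ""
    out.append(current)
    return "".join(out)


if __name__ == "__main__":
    codes = {"pn": "pencil"}
    print(type_keys(list("pn"), codes))
    print(type_keys(list("pn") + [REABBREVIATE] + list("eumonia "), codes))
```

In the second call the seldom-used word costs one keystroke more than spelling it directly, which matches the one-keystroke penalty described above.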

Table 2: Sample of Abbreviations Used in the Abbreviation Comparison Analysis 

One of the most valuable applications of the 
re-abbreviate key is to allow the number keys across the 
top of the keyboard to be used for the most commonly 
used words. The existence of the re-abbreviate key 
allows these keys to be assigned to these most 
frequently used words so that a single keystroke would 
call up these words. This can yield a significant increase 
in efficiency, since the 10 most frequently used words 
account for more than a fifth of the total words that 
were used, and more than a tenth of the keystrokes in 
normal conversation (Table 2). Although the number 
keys would normally be used for typing these high 
frequency words, the re-abbreviate key still allows the 
numbers to be accessed. For example, to type a number 
(e.g., "1956"), one would hit the "1" key (which 
would cause a word to appear), followed by the 
re-abbreviate key (to collapse it back to a "1") and then type 
the "9," "5," and "6" keys. This would add one keystroke 
to the typing of any number (which is infrequently done), 
and removes one keystroke from the abbreviation for 
the top 10 words (or 15 words, if the nearby symbol 
keys are used) each time these highly frequent words 
are used. The use of this capability with the 10 
highest-frequency words alone would increase an individual's 
communication rate by 12 to 15%, and would increase 
the efficiency of an abbreviation expansion routine by 
about 4% absolute (over using 2-letter codes). 
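
The trade-off in this paragraph is simple arithmetic: one keystroke is gained each time a top-10 word is typed, and one is lost each time a number is typed. A rough Python sketch follows; the key name `<RE>`, the function names, and the counts in the example are hypothetical and are not taken from the study.

```python
REABBREVIATE = "<RE>"  # hypothetical name for the re-abbreviate key


def keys_for_number(number):
    """Keystrokes needed to type a number when digit keys expand to words."""
    digits = str(number)
    # The first digit expands to a word, so it is collapsed back at once.
    return [digits[0], REABBREVIATE] + list(digits[1:])


def net_keystrokes_saved(top_word_uses, numbers_typed):
    """One keystroke saved per top-word use, one lost per number typed.

    The comparison is against giving those words two-letter codes.
    """
    return top_word_uses - numbers_typed


if __name__ == "__main__":
    print(keys_for_number(1956))
    # Illustrative counts only: 200 top-10 word uses and 5 numbers.
    print(net_keystrokes_saved(200, 5))
```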

Relationship of Prediction Techniques to Abbreviation 
Expansion Approaches 

Prediction techniques are techniques where the 
communication aid, based upon past events, tries to predict 
what the individual's next selection (letter or word) is 
likely to be. There are many different types of prediction 
techniques. They vary both in terms of what they use 
as input data on which to base their predictions and 
what they predict. 

In addition to varying the type of data they use for 
input and output, prediction techniques can either be 
fixed or dynamic. Fixed prediction algorithms always 
make the same prediction for the same input. For 
example, if the individual types "TH," fixed prediction 
techniques would always guess that the word would 
be "THE" when the "T" was typed and would guess 
"THAT" when the "H" was typed. If the individual then 
typed an "A," the techniques would always guess that 
it will be "THAN," etc. Fixed prediction algorithms have 
the advantage that, once the individual gets used to 
the algorithm, he can anticipate (or predict) what the 
prediction will be. Thus, the individual can type "TH" 
and then hit the "YES" key (to accept the prediction) 
without even having to look up to see that the algorithm 
has predicted the word "THAT." The system would 
always predict the word "THAT" for the characters 
"TH," and the individual can be certain of this. 

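The "T", "TH", "THA" example for fixed prediction can be sketched as a stateless lookup over a fixed, ranked word list. This is one plausible reading of such an algorithm, offered as an assumption: a word already offered for a shorter prefix is not offered again, and the function name and the three-word list are hypothetical.

```python
def predict(prefix, ranked_words):
    """Return the word a fixed predictor offers after `prefix` is typed.

    `ranked_words` is a fixed list, best-ranked word first. The result
    depends only on the prefix and the list, never on past usage.
    """
    offered = set()   # words already offered for shorter prefixes
    guess = None
    for n in range(1, len(prefix) + 1):
        part = prefix[:n]
        guess = next(
            (w for w in ranked_words if w.startswith(part) and w not in offered),
            None,
        )
        if guess is None:
            return None   # nothing left to offer for this prefix
        offered.add(guess)
    return guess


if __name__ == "__main__":
    ranked = ["THE", "THAT", "THAN"]
    for typed in ("T", "TH", "THA"):
        print(typed, "->", predict(typed, ranked))
```

Because the function keeps no state between calls, the same prefix always yields the same word, which is what lets a user accept a prediction without looking at the display.
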
Dynamic prediction algorithms, on the other hand, 
continually update and change their predictions based 
upon the recent history of word usage by the individual. 
Dynamic systems have the advantage that the 
predictions would theoretically be more efficient, since they 
are based upon the individual's actual usage. They 
have the disadvantage that the user cannot ever predict 
in advance exactly what the device will offer as a 
prediction. It may offer the word "WHEN" 30 times in a 
row as a prediction for "WH," and then suddenly change 
the prediction to "WHAT." It may then continue to 
predict the word "WHAT," or switch back to the word 
"WHEN" the next time "WH" is typed depending on 
which words the individual has used recently. As a 
result, the individual must always check after each 
keystroke (or after each word, depending on the 
algorithm) to see what the prediction is before proceeding 
with the next selection. Depending upon the individual's 
speed in making selections, the way the predictions are 
presented, the location of the prediction display, and 
the prediction algorithm, the time lost in checking the 
predictions may be more than the time saved by not 
having to spell the entire word. 
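
The "WHEN"/"WHAT" switch can be reproduced with a small sketch of a dynamic predictor. It assumes, purely for illustration, that the prediction is the most common matching word among the last few words typed; real dynamic algorithms may weight history differently, and the class name and window size are hypothetical.

```python
from collections import Counter, deque


class DynamicPredictor:
    """Predicts from recent usage, so the same prefix can give new answers."""

    def __init__(self, window=5):
        # Only the most recent `window` words influence the prediction.
        self.recent = deque(maxlen=window)

    def record(self, word):
        """Note that the user has just produced `word`."""
        self.recent.append(word)

    def predict(self, prefix):
        """Most common recent word starting with `prefix`, or None."""
        counts = Counter(w for w in self.recent if w.startswith(prefix))
        if not counts:
            return None
        return counts.most_common(1)[0][0]


if __name__ == "__main__":
    predictor = DynamicPredictor(window=5)
    for _ in range(5):
        predictor.record("WHEN")
    print(predictor.predict("WH"))
    for _ in range(3):
        predictor.record("WHAT")
    print(predictor.predict("WH"))
```

The second prediction differs from the first although the typed prefix is identical, which is why the user of such a system has to check the display each time.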

It can be argued that only dynamic prediction tech- 
niques are truly predictive techniques. Fixed prediction 
techniques are really only cued abbreviation expansion 
techniques. That is, each word is paired with a unique, 
fixed sequence of selections which will cause it to be 
displayed and selected. Although the user can watch 
the screen where the various words are presented, the 
user can also just remember the sequence of 
movements and call the word up and select it without ever 
looking at the prediction display. 

This observation is important, since if fixed prediction 
is in fact a cued abbreviation expansion technique, it 
can often be made more effective and efficient by 
viewing it from this perspective. First, the theoretical 
maximum efficiency of this fixed-vocabulary prediction 
technique would be the same as that of the truncation 
algorithm discussed (and analyzed) earlier. That is, the 
fastest an individual could go using the fixed-vocabulary 
prediction technique would be when they knew the 
predictions so well that they did not need to look at the 
prediction display. As can be seen from Table 2 and 
Figure 5, this maximum rate would still fall short of an 
abbreviation expansion rate. Adding the cognitive and 
perceptual delays inherent in a display-based prediction 
technique, the efficiency would be even less. Thus, the 
only benefit of using a prediction technique over an 
abbreviation technique would be the elimination of the 
need to remember the codes. 

A Hybrid System 

In order to take advantage of (a) the strengths of the 
fixed, short, reflex-based abbreviation expansion 
technique with the short high frequency words and (b) the 
strengths of the cued abbreviation or prediction 
techniques with the large number of seldom used words, 
a hybrid approach may be appropriate for many clients. 
Such an approach would use fixed, 1- or 2-keystroke 
abbreviations for the most frequently used words. Fifty 
words encoded in this fashion would be fairly easy to 
learn and would provide the individual with the 
maximum speed possible for these most frequently used 
words, which account for approximately 50% of the 
words, or 39% of the keystrokes, in normal conversation 
and writing. For words beyond the first 50 (or perhaps 
100), a cued abbreviation and/or prediction technique 
could be used. With this technique, the individual would 
begin to spell the word, and have the device cue him 
with predictions or shorter methods for achieving the 
desired word. In all cases, the individual could simply 
spell out the word if he felt that was more convenient. 

The exact transition points for moving between one 
technique and another would be a function of the 
individual's selection speed, reorientation time, 
perception rate, etc. For individuals with extremely slow 
selection time, it may be that prediction software would be 
appropriate even at a point less than 50 words. For 
individuals with faster reaction times, a cued but fixed 
abbreviation scheme may be the most appropriate 
technique to use for words 50 to 500. This would allow the 
individual either to use the cuing or to rely on memory 
for words, but it would obviate the need to pay attention 
to the cues over time. 
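
The dependence of the transition point on selection speed can be illustrated with a very rough timing model. Everything in it is an assumption made for the sketch: the function name, the idea that each selection made with prediction carries one display check, and the timing values in the example are hypothetical and are not measurements from this study.

```python
def prediction_pays_off(word_length, keys_with_prediction, t_select, t_check):
    """Compare spelling a word with using a display-based prediction.

    word_length          -- letters in the word (a trailing space is added)
    keys_with_prediction -- selections needed when the prediction is used
    t_select             -- seconds per selection
    t_check              -- seconds to check the display after a selection
    """
    spell_time = (word_length + 1) * t_select
    predict_time = keys_with_prediction * (t_select + t_check)
    return predict_time < spell_time


if __name__ == "__main__":
    # "because": 7 letters; assume 2 letters plus one accept key with prediction.
    print(prediction_pays_off(7, 3, t_select=5.0, t_check=1.0))  # slow selector
    print(prediction_pays_off(7, 3, t_select=0.3, t_check=1.0))  # fast selector
```

Under these made-up timings the slow selector gains from prediction and the fast selector loses, which is the direction of the effect described in the text.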

In any case, it is clear that no single technique or 
approach is appropriate for all individuals. It also 
appears that even for a given individual, the optimum 
system may consist of multiple strategies, in the same 
way we use multiple strategies in writing and/or making 
notes for ourselves. 

Conclusion 

Although various abbreviation-expansion strategies 
exist, including variable length fixed terminator, variable 
length auto terminator, numeric terminator, expand key, 
and macro key approaches, all can be implemented on 
a generic, variable-length, self-terminating 
abbreviation-expansion algorithm. With any chosen approach, it is 
also possible to choose from different abbreviation 
strategies such as contraction, arbitrary, or 
idiosyncratic-logical (user-selected) abbreviations. Of these, 
the arbitrary has the potential for the greatest motor 
efficiency. This motor efficiency advantage, however, is 
marginal as compared to the idiosyncratic-logical, and therefore 
has little merit in practice. The difference between the 
idiosyncratic-logical and truncation approaches, 
however, is substantial. Further, the problem with collisions 
using the truncation approach results in long 
abbreviations which in some cases are as long as the words 
they represent. As a result, the idiosyncratic-logical 
abbreviation approach can be very effective, particularly 
when used with the shorter, higher-frequency words 
which make up most of our communication. For words 
beyond the first 50, 100, or 200, however, memorizing 
abbreviations may be a problem, even over the long 
term. In these cases, a cued abbreviation-expansion 
technique may be most effective. Probably the least 
cognitively difficult of these would be a cued 
abbreviation-expansion technique using truncation codes for 
the abbreviations. Such a technique would look like 
and behave identically to a fixed-vocabulary word 
prediction technique. Either single words or multiple 
expansions (predictions) could be displayed in this 
fashion. Because of the relative strengths of both of these 
techniques, the most effective and appropriate 
approach may be a hybrid of the techniques, with the 
idiosyncratic-logical approach being used for the 
shorter, most frequently used words and a cued 
truncation abbreviation-expansion approach (fixed 
prediction) for the lower frequency words. 

Address reprint requests to: Gregg C. Vanderheiden, 
Trace Research & Development Center, Waisman Center, 
1500 Highland Ave., Madison, WI 53705-2280. 